We distinguish two basic characteristics:
R-specific)father mother name age gender
John 33 male
Julia 32 female
John Julia Jack 6 male
John Julia Jill 4 female
John Julia John jnr 2 male
David 45 male
Debbie 42 female
David Debbie Donald 16 male
David Debbie Dianne 12 female
father mother name age gender
John 33 male
Julia 32 female
John Julia Jack 6 male
John Julia Jill 4 female
John Julia John jnr 2 male
David 45 male
Debbie 42 female
David Debbie Donald 16 male
David Debbie Dianne 12 female
father mother name age gender
John 33 male
Julia 32 female
John Julia Jack 6 male
John Julia Jill 4 female
John Julia John jnr 2 male
David 45 male
Debbie 42 female
David Debbie Donald 16 male
David Debbie Dianne 12 female
dateRep,day,month,year,cases,deaths,countriesAndTerritories,geoId,countryterritoryCode,popData2019,continentExp,Cumulative_number_for_14_days_of_COVID-19_cases_per_100000 14/10/2020,14,10,2020,66,0,Afghanistan,AF,AFG,38041757,Asia,1.94523087 13/10/2020,13,10,2020,129,3,Afghanistan,AF,AFG,38041757,Asia,1.81116766 12/10/2020,12,10,2020,96,4,Afghanistan,AF,AFG,38041757,Asia,1.50361089
<records> <record> <dateRep>14/10/2020</dateRep> <day>14</day> <month>10</month> <year>2020</year> <cases>66</cases> <deaths>0</deaths> <countriesAndTerritories>Afghanistan</countriesAndTerritories> <geoId>AF</geoId> <countryterritoryCode>AFG</countryterritoryCode> <popData2019>38041757</popData2019> <continentExp>Asia</continentExp> <Cumulative_number_for_14_days_of_COVID-19_cases_per_100000>1.94523087</Cumulative_number_for_14_days_of_COVID-19_cases_per_100000> </record> <record> <dateRep>13/10/2020</dateRep> ... </records>
<records> <record> <dateRep>14/10/2020</dateRep> <day>14</day> <month>10</month> <year>2020</year> <cases>66</cases> <deaths>0</deaths> <countriesAndTerritories>Afghanistan</countriesAndTerritories> <geoId>AF</geoId> <countryterritoryCode>AFG</countryterritoryCode> <popData2019>38041757</popData2019> <continentExp>Asia</continentExp> <Cumulative_number_for_14_days_of_COVID-19_cases_per_100000>1.94523087</Cumulative_number_for_14_days_of_COVID-19_cases_per_100000> </record> <record> <dateRep>13/10/2020</dateRep> ... </records>
<records> <record> <dateRep>14/10/2020</dateRep> <day>14</day> <month>10</month> <year>2020</year> <cases>66</cases> <deaths>0</deaths> <countriesAndTerritories>Afghanistan</countriesAndTerritories> <geoId>AF</geoId> <countryterritoryCode>AFG</countryterritoryCode> <popData2019>38041757</popData2019> <continentExp>Asia</continentExp> <Cumulative_number_for_14_days_of_COVID-19_cases_per_100000>1.94523087</Cumulative_number_for_14_days_of_COVID-19_cases_per_100000> </record> <record> <dateRep>13/10/2020</dateRep> ... </records>
The actual content we know from the csv-type example above is nested between the ‘records’-tags:
<records> ... </records>
There are two principal ways to link variable names to values.
<variable>Monthly Surface Clear-sky Temperature (ISCCP) (Celsius)</variable>
<filename>ISCCPMonthly_avg.nc</filename>
<filepath>/usr/local/fer_data/data/</filepath>
<badflag>-1.E+34</badflag>
<subset>48 points (TIME)</subset>
<longitude>123.8W(-123.8)</longitude>
<latitude>48.8S</latitude>
<case date="16-JAN-1994" temperature="9.200012" />
<case date="16-FEB-1994" temperature="10.70001" />
<case date="16-MAR-1994" temperature="7.5" />
<case date="16-APR-1994" temperature="8.100006" />
<filename>ISCCPMonthly_avg.nc</filename>.<case date="16-JAN-1994" temperature="9.200012" />.Attributes-based:
<case date="16-JAN-1994" temperature="9.200012" />
<case date="16-FEB-1994" temperature="10.70001" />
<case date="16-MAR-1994" temperature="7.5" />
<case date="16-APR-1994" temperature="8.100006" />
Tag-based:
<cases>
<case>
<date>16-JAN-1994<date/>
<temperature>9.200012<temperature/>
<case/>
<case>
<date>16-FEB-1994<date/>
<temperature>10.70001<temperature/>
<case/>
<case>
<date>16-MAR-1994<date/>
<temperature>7.5<temperature/>
<case/>
<case>
<date>16-APR-1994<date/>
<temperature>8.100006<temperature/>
<case/>
<cases/>
XML:
<person>
<firstName>John</firstName>
<lastName>Smith</lastName>
<age>25</age>
<address>
<streetAddress>21 2nd Street</streetAddress>
<city>New York</city>
<state>NY</state>
<postalCode>10021</postalCode>
</address>
<phoneNumber>
<type>home</type>
<number>212 555-1234</number>
</phoneNumber>
<phoneNumber>
<type>fax</type>
<number>646 555-4567</number>
</phoneNumber>
<gender>
<type>male</type>
</gender>
</person>
XML:
<person>
<firstName>John</firstName>
<lastName>Smith</lastName>
<age>25</age>
<address>
<streetAddress>21 2nd Street</streetAddress>
<city>New York</city>
<state>NY</state>
<postalCode>10021</postalCode>
</address>
<phoneNumber>
<type>home</type>
<number>212 555-1234</number>
</phoneNumber>
<phoneNumber>
<type>fax</type>
<number>646 555-4567</number>
</phoneNumber>
<gender>
<type>male</type>
</gender>
</person>
JSON:
{"firstName": "John",
"lastName": "Smith",
"age": 25,
"address": {
"streetAddress": "21 2nd Street",
"city": "New York",
"state": "NY",
"postalCode": "10021"
},
"phoneNumber": [
{
"type": "home",
"number": "212 555-1234"
},
{
"type": "fax",
"number": "646 555-4567"
}
],
"gender": {
"type": "male"
}
}
XML:
<person> <firstName>John</firstName> <lastName>Smith</lastName> </person>
JSON:
{"firstName": "John",
"lastName": "Smith",
}
HyperText Markup Language (HTML), designed to be read by a web browser.
HTML documents/webpages consist of ‘semi-structured data’:
<!DOCTYPE html>
<html>
<head>
<title>hello, world</title>
</head>
<body>
<h2> hello, world </h2>
</body>
</html>
In this example, we look at Wikipedia’s Economy of Switzerland page.